Electronic Corpora: As Powerful Tools in Computational Linguistic Analyses
Abstract
Technology has entered almost every domain of daily life. In computational linguistics, electronic corpora play a central role. Linguistic phenomena can now be studied through statistical analyses: concordances, collocations and frequency counts have done much to make linguistic research more accessible, more adequate and more accurate.
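To make the three analyses named in the abstract concrete, here is a minimal sketch in Python using only the standard library. It is an illustration, not the method of the paper: the function names (`tokenize`, `frequencies`, `concordance`, `collocations`), the simple regular-expression tokenizer, the window sizes and the sample text are all assumptions chosen for brevity. Collocation is reduced here to raw co-occurrence counts within a window, whereas real corpus tools usually add an association measure.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens (naive tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def frequencies(tokens):
    """Count how often each token occurs in the corpus."""
    return Counter(tokens)


def concordance(tokens, keyword, window=2):
    """Return keyword-in-context lines with `window` tokens on each side of every hit."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines


def collocations(tokens, keyword, window=1):
    """Count the words that co-occur with `keyword` within `window` tokens."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == keyword:
            neighbours = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
            counts.update(neighbours)
    return counts


# Hypothetical three-sentence sample standing in for a real electronic corpus.
sample = (
    "The corpus is large. A large corpus helps linguists. "
    "Linguists query the corpus daily."
)
tokens = tokenize(sample)

print(frequencies(tokens).most_common(3))
for line in concordance(tokens, "corpus"):
    print(line)
print(collocations(tokens, "corpus").most_common(3))
```

On a real corpus the same three functions would be applied to far more text, and the naive tokenizer would normally be replaced by one suited to the language under study.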
Similar resources
Lexical Semantic Techniques for Corpus Analysis
In this paper we outline a research program for computational linguistics, making extensive use of text corpora. We demonstrate how a semantic framework for lexical knowledge can suggest richer relationships among words in text beyond that of simple co-occurrence. The work suggests how linguistic phenomena such as metonymy and polysemy might be exploitable for semantic tagging of lexical items....
Software Tools for Morphological Tagging of Zulu Corpora and Lexicon Development
The aim of this paper is to discuss aspects of an on-going project on the development of grammatical and lexical resources for Zulu with sufficient coverage for unrestricted text. We explain how the basic software tools of computational morphology are used in linguistic processing, more specifically for automatic word form recognition and morphological tagging of the growing stock of electronic...
Enhancing Access to Media Collections and Archives Using Computational Linguistic Tools
In this paper, we outline the strategies, methodology, and infrastructure needed to bring advanced computational linguistic tools to researchers and archivists in the humanities. We discuss three use cases involving the application of the Language Application Grid (LAPPS), an open, web-based infrastructure providing interoperable access to hundreds of computational linguistic (CL) component web...
The Lácio-Web: Corpora and Tools to Advance Brazilian Portuguese Language Investigations and Computational Linguistic Tools
In this paper we discuss the five requirements for building large publicly available corpora which geared the construction of the LácioWeb corpora and their environments: 1) a comprehensive text typology; 2) text copyright clearance, compilation and annotation scheme; 3) a friendly and didactic interface; 4) the need to serve as support for several types of research; 5) the need to offer an arr...
Developing linguistic theories using annotated corpora
This paper aims to carve out a place for corpus research within theoretical linguistics and psycholinguistics. We argue that annotated corpora naturally complement native speaker intuitions and controlled psycholinguistic methods and thus can be powerful tools for developing and evaluating linguistic theories. We also review basic methods and best practices for moving from corpus annotations to...
Publication date: 2009